Feat: default OpenAI chat to Responses API where possible #66
Merged
Prefer the OpenAI Responses API (`ApiFlavor.RESPONSES`) when discovery signals both flavors are available, and when the langchain factory has no other flavor to route with:

- `UiPathBaseSettings.get_model_info()`: when multiple OpenAI entries remain after existing filtering, prefer the `responses` / `OpenAiResponses` flavor over `chat-completions`.
- `get_chat_model()` (langchain): inside the OpenAI match arm, default `api_flavor=RESPONSES` when discovery doesn't specify one and the caller didn't either.

The LiteLLM client keeps `chat-completions` as its fallback for the single-entry / `apiFlavor=null` case. It serves both completion and embedding paths from the same instance, and the `responses/` model prefix in `_resolve_litellm_model` would break embedding calls if Responses were the default on a UiPath-owned OpenAI embedding model that reports `apiFlavor=null` at discovery. The discovery-level tie-break in `get_model_info` still benefits that client whenever the backend advertises both flavors explicitly.

Bumps both packages to 1.9.3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
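The discovery tie-break described in this commit message can be sketched roughly as follows. This is an illustration only, not the code in `get_model_info()`: it assumes discovery entries are plain dicts with `vendor` and `apiFlavor` keys, and both the helper name `prefer_responses` and the vendor string `"OpenAi"` are hypothetical.

```python
from typing import Optional

# Flavor strings as named in the commit message; treating them as a set is an assumption.
RESPONSES_FLAVORS = {"responses", "OpenAiResponses"}


def prefer_responses(entries: list[dict]) -> Optional[dict]:
    """Pick one discovery entry, preferring a Responses-flavored OpenAI entry.

    Hypothetical helper: the dict shape and the "OpenAi" vendor string are assumed.
    """
    if not entries:
        return None
    openai_entries = [e for e in entries if e.get("vendor") == "OpenAi"]
    # The tie-break only applies when several OpenAI entries remain.
    if len(openai_entries) > 1:
        for entry in openai_entries:
            if entry.get("apiFlavor") in RESPONSES_FLAVORS:
                return entry
    # Otherwise fall back to the first remaining entry, whatever its flavor.
    return entries[0]
```

With a single entry, or with entries from another vendor, the sketch leaves the selection untouched, which mirrors the intent that the tie-break fires only when both OpenAI flavors are advertised.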
Revises the PR in two directions:

- `UiPathLiteLLM` now matches the langchain factory: when no `api_flavor` is discovered or supplied for an OpenAI model, the client defaults to `ApiFlavor.RESPONSES`. Existing `openai_*_client` test fixtures that rely on previously-recorded chat-completions cassettes now pin `api_flavor=CHAT_COMPLETIONS` explicitly; the `openai_responses_client` fixture and `OPENAI_RESPONSES_CONFIGS` continue to exercise Responses against their own cassettes.
  - To keep OpenAI embeddings working under the new default, the LiteLLM client's `embedding()` / `aembedding()` now pass the raw `self._model_name` (no `responses/` / `invoke/` / `converse/` route prefix); those prefixes are completion-only.
- Drops the `get_model_info()` responses tie-break that was part of the first commit. The LiteLLM default + langchain factory default together cover the user-visible behavior; the tie-break was redundant and narrowed the data model for a case already handled by the defaults.

Related test updates:

- `test_openai_defaults_to_responses` replaces `test_openai_defaults_to_chat_completions`; the litellm-model-name assertion now expects the `responses/` prefix on the default path.
- Removes the three `get_model_info` tie-break tests added earlier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
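The split between completion and embedding model names in this commit can be pictured with a small sketch. It is a simplification under stated assumptions: the `ApiFlavor` enum here is a stand-in, the two helper functions are hypothetical, and the real `_resolve_litellm_model` may build its model string differently.

```python
from enum import Enum


class ApiFlavor(Enum):
    """Hypothetical stand-in for the project's ApiFlavor enum."""

    CHAT_COMPLETIONS = "chat-completions"
    RESPONSES = "responses"


def completion_model(model_name: str, flavor: ApiFlavor) -> str:
    """Model string for completion calls: the Responses flavor gets a route prefix."""
    if flavor is ApiFlavor.RESPONSES:
        return f"responses/{model_name}"
    return model_name


def embedding_model(model_name: str, flavor: ApiFlavor) -> str:
    """Model string for embedding calls: always the raw name, since route
    prefixes are completion-only."""
    return model_name
```

Keeping the prefix logic out of the embedding path means the default flavor can change without affecting embedding requests made through the same client instance.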
Regenerated by the local test run after switching `UiPathLiteLLM`'s OpenAI default to the Responses API. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LiteLLM 1.83.x drops the injected httpx client when its
acompletion→aresponses bridge fires: in
``completion_extras/litellm_responses_transformation/handler.py`` the
async ``acompletion`` path calls ``transform_request`` without
``client=kwargs.get("client")`` and then ``aresponses(**request_data)``
without re-threading it, so by the time ``response_api_handler`` runs
``kwargs.get("client")`` is ``None``. LiteLLM then builds a fresh
``httpx.AsyncClient`` with no base URL, no auth, and sends
``Authorization: Bearer PLACEHOLDER`` — which the UiPath gateway
rejects with ``Unable to extract claim sub_type from token``.
The sync LiteLLM Responses path doesn't hit this bug (the sync bridge
does forward ``client``), but routing async OpenAI callers through
Responses unconditionally regresses every async user until the
upstream fix lands.
This commit keeps the Responses default only in the langchain factory,
which goes through ``UiPathChatOpenAI`` / ``UiPathAzureChatOpenAI`` (the
OpenAI SDK, not LiteLLM), where sync and async both work. The
LiteLLM client keeps its existing chat-completions fallback and
its existing integration cassettes.
Versioning: langchain-only change, so only
``uipath_langchain_client`` bumps to 1.9.3. Core stays at 1.9.2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
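The dropped-client behavior described in the commit message above follows a common pattern: a bridge rebuilds the request dict and forgets to copy an injected dependency. The toy model below illustrates that pattern only. It is not LiteLLM's code; the function names echo those in the commit message, but every body here is hypothetical, and plain strings stand in for httpx clients.

```python
import asyncio


async def aresponses(**kwargs):
    """Stand-in for the downstream handler: builds a default client if none arrives."""
    client = kwargs.get("client")
    return client if client is not None else "fresh-default-client"


def transform_request(model, messages):
    """Builds a new request dict; anything not copied explicitly is lost."""
    return {"model": model, "input": messages}


async def acompletion_buggy(**kwargs):
    request_data = transform_request(kwargs["model"], kwargs["messages"])
    # The injected client is never re-threaded into request_data.
    return await aresponses(**request_data)


async def acompletion_fixed(**kwargs):
    request_data = transform_request(kwargs["model"], kwargs["messages"])
    # Forward the caller's client so base URL and auth are preserved.
    request_data["client"] = kwargs.get("client")
    return await aresponses(**request_data)


async def demo():
    """Run both variants with the same injected client and return what each used."""
    args = {"model": "m", "messages": [], "client": "injected-client"}
    return await acompletion_buggy(**args), await acompletion_fixed(**args)
```

Calling `asyncio.run(demo())` shows the buggy path ending up with the default client while the fixed path keeps the injected one, which is the difference between a request with gateway auth and one without.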
Summary

- Prefer the Responses API (`ApiFlavor.RESPONSES`) for OpenAI chat whenever discovery lets us. Two touch-points:
  - `UiPathBaseSettings.get_model_info()` tie-breaks multiple OpenAI matches toward the `responses` / `OpenAiResponses` entry when discovery returns both flavors.
  - `get_chat_model()` defaults `api_flavor=RESPONSES` for OpenAI when neither discovery nor the caller specified one (handled inside the OpenAI match arm).
- The LiteLLM client keeps `chat-completions` on purpose: the same client serves both completions and embeddings, and the `responses/` model prefix in `_resolve_litellm_model` would break embedding calls on OpenAI embedding models that discover with `apiFlavor=null`. The new `get_model_info` tie-break still benefits this client when the backend advertises both flavors explicitly.
- Both packages bumped to `1.9.3`; langchain dep pinned to `uipath-llm-client>=1.9.3`.

Packages affected

- `uipath-llm-client` (core): `get_model_info` tie-break, `base.py` imports.
- `uipath-langchain-client`: factory default for OpenAI chat.

Test plan

- `ruff check`, `ruff format --check`, `pyright` pass
- `pytest tests`: 1521 passed, 736 skipped, 9 xpassed
- `get_model_info` prefers Responses over Chat Completions when discovery returns both (BYOM and routing-form strings)
- `get_model_info` tie-break does not fire for non-OpenAI vendors
- `get_chat_model` defaults `api_flavor=RESPONSES` for OpenAI UiPath-owned (`apiFlavor=null`)
- Explicit `api_flavor` still wins over the default
- `OpenAiChatCompletions` still routes as chat-completions

🤖 Generated with Claude Code
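The precedence that the test plan exercises for `get_chat_model()` (caller first, then discovery, then the Responses default) can be written down as a short sketch. The enum and the function `resolve_openai_flavor` are hypothetical stand-ins for illustration, not the factory's actual code.

```python
from enum import Enum
from typing import Optional


class ApiFlavor(Enum):
    """Hypothetical stand-in for the project's ApiFlavor enum."""

    CHAT_COMPLETIONS = "chat-completions"
    RESPONSES = "responses"


def resolve_openai_flavor(
    caller: Optional[ApiFlavor],
    discovered: Optional[ApiFlavor],
) -> ApiFlavor:
    """Resolve the flavor for an OpenAI chat model (assumed precedence order)."""
    if caller is not None:
        return caller  # an explicit api_flavor always wins
    if discovered is not None:
        return discovered  # discovery's flavor is honored next
    return ApiFlavor.RESPONSES  # default when neither specifies one
```

Under this ordering, a UiPath-owned model that discovers with `apiFlavor=null` resolves to Responses, while a model discovered as chat-completions keeps that route unless the caller overrides it.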